Identification of Informative Genes for Molecular Classification Using Probabilistic Model Building Genetic Algorithm
Abstract
DNA microarrays allow the expression levels of thousands of genes in an organism to be monitored and measured simultaneously. Systematic computational analysis of this vast amount of data provides insight into many aspects of biological processes. Recently, there has been growing interest in classifying patient samples based on these gene expression levels. The main challenge is that the number of genes overwhelms the number of available training samples in the data set; many of these genes are irrelevant to classification and reduce the accuracy of the classifier. The choice of genes affects several aspects of classification: accuracy, required learning time, cost, and the number of training samples needed. In this paper, we propose a new Probabilistic Model Building Genetic Algorithm (PMBGA) for identifying informative genes for molecular classification and present our unbiased experimental results on three benchmark data sets.
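The abstract does not say which probabilistic model the proposed PMBGA builds, so the following is only a minimal sketch of the general idea, assuming the simplest univariate (PBIL/UMDA-style) model: each gene has an independent selection probability, candidate gene subsets are sampled from those probabilities, and the probabilities are moved toward the best-scoring subsets. The names `pmbga_select` and `toy_fitness`, all parameter values, and the toy scoring rule are hypothetical; in practice the fitness would be an estimate of classifier accuracy on the selected genes.

```python
import random


def pmbga_select(fitness, n_genes, pop_size=50, n_best=10,
                 generations=30, lr=0.2, seed=0):
    """Univariate PMBGA sketch (an assumption, not the paper's algorithm).

    Returns the best gene mask seen, its fitness, and the final
    per-gene selection probabilities.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_genes  # model: P(gene i is selected)
    best_mask, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population of binary gene masks from the model.
        population = [
            [1 if rng.random() < probs[i] else 0 for i in range(n_genes)]
            for _ in range(pop_size)
        ]
        scored = sorted(((fitness(m), m) for m in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0]
        elite = [mask for _, mask in scored[:n_best]]
        # Move each probability toward the gene's frequency in the elite,
        # clamped so that no gene becomes permanently fixed.
        for i in range(n_genes):
            freq = sum(mask[i] for mask in elite) / n_best
            probs[i] = (1 - lr) * probs[i] + lr * freq
            probs[i] = min(0.95, max(0.05, probs[i]))
    return best_mask, best_fit, probs


# Hypothetical stand-in for classifier accuracy: reward three
# "informative" genes and penalise the size of the selected subset.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_fit, probs = pmbga_select(toy_fitness, n_genes=20)
```

Replacing `toy_fitness` with a cross-validated accuracy estimate would make this a wrapper-style selector; the subset-size penalty is one simple way to favour small gene sets.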
Similar resources
Identification of Alzheimer disease-relevant genes using a novel hybrid method
Identifying genes underlying complex diseases/traits, which generally involve multiple etiological mechanisms and contributing genes, is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...
Modeling gene regulatory networks: Classical models, optimal perturbation for identification of network
A deep understanding of molecular biology has allowed the emergence of new technologies like DNA decryption. On the other hand, advances in molecular biology have made the manipulation of genetic systems simpler than ever; this promises extraordinary progress in biological, medical and biotechnological applications. This is not an unrealistic goal since genes which are regulated by gene regulatory ...
Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine
Different approaches have been proposed for feature selection to obtain a suitable feature subset from among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...
Optimal Policy of Condition-Based Maintenance Considering Probabilistic Logistic Times and the Environmental Contamination Issues
One important issue in using industrial and manufacturing facilities is their impact on the environment. Much of this equipment has negative effects on the environment after being operated for some time. In this research, the probabilistic logistic times and the destructive effects for maintenance are studied, and an appropriate model is suggested and evaluated. The aim of the proposed m...
SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy
In this paper, we propose a new gene selection algorithm based on the Shuffled Frog Leaping Algorithm, called SFLA-FS. The proposed algorithm is used to improve cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not usable in some tasks, for example in cancer classification....
Publication date: 2004